GCNU TR 1999-002

Linear Heteroencoders†

Sam Roweis, Gatsby Unit
Carlos Brody, Computation and Neural Systems, California Institute of Technology

Abstract
This note gives a closed-form expression for the linear transform computed by an optimally trained linear heteroencoder network of arbitrary topology trained to minimize squared error. The transform can be thought of as a restricted-rank version of the basic linear least-squares regression (discrete Wiener filter) between input and output. The rank restriction is set by the “bottleneck” size of the network, i.e. the minimum number of hidden units in any layer. A special case of this expression is the well-known result that linear autoencoders with a bottleneck of size r perform a transform equivalent to projecting into the subspace spanned by the first r principal components of the data. This result eliminates the need to explicitly train linear heteroencoder networks.

1 Linear autoencoders

Because they are fast to train and require few parameters, linear networks [1] provide an important performance comparison with more complex data analysis methods. The equivalent linear transform computed by a linear autoencoder network with a bottleneck of size r has long been known to be the projection into the subspace spanned by the first r principal components of the training data. This holds provided the training algorithm minimizes squared error at the output and achieves the global minimum of that error. Furthermore, Bourlard and Kamp (1988) have shown that the optimal transformation remains unchanged even if nonlinear transfer functions are added to units before the bottleneck, so long as all layers of the network after the bottleneck are linear. In other words, the result still holds for networks that are merely output-linear.

These results are important because they eliminate the need to explicitly train linear or output-linear autoencoders. Algorithms such as the singular value decomposition (SVD) can be used to quickly compute the optimal transform. These algorithms run in a known, fixed time; their results are easily reproducible; they are guaranteed to achieve the global minimum of error; and they produce an ordered set of orthogonal eigenvectors.

In this note, we present the analogous results for linear (and output-linear) heteroencoder networks with a bottleneck. Enforcing such a bottleneck may be important in cases where the high dimensionality of the input space leads to overtraining and poor generalization. The results below allow the equivalent transform (and thus the performance) of an optimally trained network to be computed easily, without explicit training.

Many previous authors [Baldi and Hornik, 1989; Kung and Diamantaras, 1991; Diamantaras and Kung, 1994; Scharf, 1991; Stoica and Viberg, 1996; Ghahramani, 1996] have proved portions of the results we review below. However, the proofs are often part of a more detailed or lengthy discussion and as a result are sometimes mathematically more complex. The goal of this note is to provide a short and simple exposition of these results, along with practical expressions for their implementation.
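The following sketch (ours, not code from the note itself) illustrates how such a transform can be obtained without iterative training. It uses the classical reduced-rank least-squares construction: form the unrestricted least-squares (Wiener filter) solution W = Y Xᵀ(X Xᵀ)⁻¹ and project its fitted outputs W X onto their top r left singular vectors. We assume here that the note's closed form agrees with this standard construction, that data points are stored as columns of X and Y, and that X Xᵀ is invertible; all variable names are illustrative.

```python
import numpy as np

def reduced_rank_regression(X, Y, r):
    """Best linear map W with rank(W) <= r minimizing ||Y - W @ X||_F**2."""
    # Unrestricted least-squares (discrete Wiener filter) solution.
    W_full = Y @ X.T @ np.linalg.inv(X @ X.T)
    # Top-r left singular vectors of the fitted outputs W_full @ X.
    U, _, _ = np.linalg.svd(W_full @ X, full_matrices=False)
    P = U[:, :r] @ U[:, :r].T        # rank-r orthogonal projector
    return P @ W_full                # restricted-rank Wiener filter

# Autoencoder special case (Y = X): the rank-r transform reduces to the
# projection onto the subspace of the first r principal components.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 200))
X -= X.mean(axis=1, keepdims=True)   # zero-mean data, as PCA assumes

W_auto = reduced_rank_regression(X, X, r=2)
U, _, _ = np.linalg.svd(X, full_matrices=False)
P_pca = U[:, :2] @ U[:, :2].T        # PCA projector from the SVD of the data
print(np.allclose(W_auto, P_pca))    # True
```

A single SVD thus yields the optimal bottleneck-r transform, reproducibly and at the global minimum of the squared error, and the autoencoder special case recovers the principal-component projection quoted above.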
† Based on a May 1997 version. During writing, the authors were supported in part by the Center for Neuromorphic Systems Engineering as a part of the National Science Foundation Engineering Research Center Program under grant EEC-9402726, and by the Natural Sciences and Engineering Research Council of Canada under an NSERC 1967 Award.

[1] We refer to any network in which the transfer function of every unit is linear as a linear network. Similarly, we use the term linear layer for a layer in which every unit’s transfer function is linear. A network has a bottleneck of size r if the minimum number of units in any layer is r; it has no bottleneck if all layers have at least as many units as both the input dimension and the output dimension. If all layers after the bottleneck layer (or after the last bottleneck, if there are several) are linear, we call the network output-linear.